AN EXPOSITION TO INFORMATION PERCOLATION FOR THE ISING MODEL


EYAL LUBETZKY AND ALLAN SLY 


Abstract. Information percolation is a new method for analyzing stochastic spin systems through 
classifying and controlling the clusters of information-flow in the space-time slab. It yielded sharp 
mixing estimates (cutoff with an $O(1)$-window) for the Ising model on $\mathbb{Z}^d$ up to the critical temperature,
as well as results on the effect of initial conditions on mixing. In this expository note we demonstrate 
the method on lattices (more generally, on any locally-finite transitive graph) at very high temperatures. 

1. Introduction 

The Ising model on a finite graph $G$ with vertex-set $V$ and edge-set $E$ is a distribution over the set
of configurations $\Omega = \{\pm1\}^V$; each $\sigma \in \Omega$ is an assignment of plus/minus spins to the sites in $V$, and
the probability of $\sigma \in \Omega$ is given by the Gibbs distribution

\[ \pi(\sigma) = Z^{-1} e^{\beta \sum_{uv \in E} \sigma(u)\sigma(v)} , \tag{1.1} \]

where $Z$ is a normalizer (the partition function) and $\beta$ is the inverse-temperature, here taken to be
non-negative (ferromagnetic). These definitions extend to infinite locally finite graphs (see, e.g., [5,12]).
The (continuous-time) heat-bath Glauber dynamics for the Ising model is the Markov chain, reversible
w.r.t. the Ising measure $\pi$, where each site is associated with a rate-1 Poisson clock, and as the clock
at some site $u$ rings, the spin of $u$ is replaced by a sample from the marginal of $\pi$ given all other spins.

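An important notion of measuring the convergence of a Markov chain $(X_t)$ to its stationary measure
$\pi$ is its total-variation mixing time, denoted $t_{\mathrm{mix}}(\varepsilon)$ for a precision parameter $0 < \varepsilon < 1$:

\[ t_{\mathrm{mix}}(\varepsilon) = \inf\Big\{ t : \max_{x_0 \in \Omega} \| \mathbb{P}_{x_0}(X_t \in \cdot) - \pi \|_{\mathrm{tv}} \le \varepsilon \Big\} , \]

where here and in what follows $\mathbb{P}_{x_0}$ denotes the probability given $X_0 = x_0$, and the total-variation
distance $\|\nu_1 - \nu_2\|_{\mathrm{tv}}$ is defined as $\max_{A \subset \Omega} |\nu_1(A) - \nu_2(A)| = \frac12 \sum_{\sigma \in \Omega} |\nu_1(\sigma) - \nu_2(\sigma)|$.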
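
The two expressions for the total-variation distance can be compared directly on a small state space. The following is a minimal sketch, not part of the original argument; the function names and the three-state example are illustrative assumptions.

```python
import itertools

def tv_half_l1(nu1, nu2):
    """Total-variation distance as half the L1 distance between two laws."""
    return 0.5 * sum(abs(nu1[x] - nu2[x]) for x in nu1)

def tv_max_over_events(nu1, nu2):
    """Total-variation distance as the maximum of |nu1(A) - nu2(A)| over all events A."""
    states = list(nu1)
    best = 0.0
    for k in range(len(states) + 1):
        for event in itertools.combinations(states, k):
            gap = abs(sum(nu1[x] - nu2[x] for x in event))
            best = max(best, gap)
    return best

# Hypothetical example: two laws on a three-point space.
nu1 = {"a": 0.5, "b": 0.3, "c": 0.2}
nu2 = {"a": 0.2, "b": 0.3, "c": 0.5}
print(tv_half_l1(nu1, nu2), tv_max_over_events(nu1, nu2))
```

The brute-force maximum enumerates all events, so it is only meant for very small spaces; the half-L1 formula is the practical one.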

The impact of the precision parameter $\varepsilon$ in this definition is addressed by the cutoff phenomenon (a
concept going back to the pioneering works [1,2,4]), roughly saying that the choice of any fixed $\varepsilon$ does not
change the asymptotics of $t_{\mathrm{mix}}(\varepsilon)$ as the system size goes to infinity. Formally, a family of ergodic finite
Markov chains $(X_t)$, indexed by an implicit parameter $n$, exhibits cutoff iff $t_{\mathrm{mix}}(\varepsilon) = (1 + o(1))\, t_{\mathrm{mix}}(\varepsilon')$
for any fixed $0 < \varepsilon, \varepsilon' < 1$. The cutoff window addresses the correction terms: a sequence $w_n =
o(t_{\mathrm{mix}}(1/2))$ is a cutoff window if $t_{\mathrm{mix}}(\varepsilon) = t_{\mathrm{mix}}(1 - \varepsilon) + O(w_n)$ for any $0 < \varepsilon < 1$ with an implicit
constant that may depend on $\varepsilon$. That is, the Markov chain exhibits a sharp transition in its convergence
to equilibrium, whereby its distance drops abruptly (along the cutoff window) from near 1 to near 0.

Establishing cutoff can be highly challenging (see the survey [3]), even for simple random walk with a
uniform stationary measure: e.g., it is conjectured that on every transitive expander graph the random
walk exhibits cutoff, yet there is not a single example of a transitive expander where cutoff was confirmed
(even without transitivity, there were no examples of expanders with cutoff before [6,7]). As for Glauber
dynamics for the Ising model, where our understanding of $\pi$ is far more limited, until recently cutoff
was confirmed only in a few cases, with first results on lattices appearing in the works [8,9].

The methods used to analyze the dynamics for the Ising model on $\mathbb{Z}^d$ in those works had several
caveats: the reliance on delicate features such as log-Sobolev inequalities did not cover the full high
temperature regime in dimensions $d \ge 3$, and did not give a correct bound on the cutoff window;
furthermore, the argument immediately broke on any geometry with exponential ball growth (expanders).
Recently, in [10] and its companion paper [11], we introduced a new framework called information
percolation, which does not have these limitations and can hopefully be used to analyze a wide range
of stochastic spin systems. We demonstrated its application in analyzing the stochastic Ising model on
$\mathbb{Z}^d$ up to $\beta_c$, and in comparing the effects of different starting configurations on mixing: e.g., showing
that an initial state of i.i.d. spins mixes about twice as fast as the all-plus starting state (which is
essentially the worst one), while almost every starting state is as bad as the worst one.

Here we demonstrate a simpler application of the method for the Ising model on a fixed-degree
transitive graph at very high temperatures, establishing cutoff within an $O(1)$-window around

\[ t_{\mathfrak m} = \inf\{ t > 0 : \mathfrak{m}_t \le 1/\sqrt{n} \} , \tag{1.2} \]

where $\mathfrak{m}_t = \mathbb{E} X_t^+(v)$ is the magnetization at the origin at time $t > 0$ (in which $X_t^+$ denotes the dynamics
started from all-plus). That is, $t_{\mathfrak m}$ is the time at which the expected sum-of-spins drops to a square-root
of the volume, where intuitively it is absorbed within the normal fluctuations of the Ising measure.

Theorem 1. For any $d \ge 2$ there exists $\beta_0 = \beta_0(d) > 0$ such that the following holds. Let $G$ be a
transitive graph of degree $d$ on $n$ vertices. For any fixed $0 < \varepsilon < 1$, continuous-time Glauber dynamics for the Ising
model on $G$ at inverse-temperature $0 \le \beta < \beta_0$ satisfies $t_{\mathrm{mix}}(\varepsilon) = t_{\mathfrak m} \pm O_\varepsilon(1)$.

In particular, the dynamics on a sequence of such graphs has cutoff with an $O(1)$-window around $t_{\mathfrak m}$.

2. Basic setup of information percolation and proof of Theorem 1 

In this section we define and classify the information percolation clusters in their most basic form
(commenting how this setup may be altered in more delicate situations such as $\beta$ close to $\beta_c$ on $\mathbb{Z}^d$),
then reduce the proof of Theorem 1 to estimating the probability that a cluster is “red”.

2.1. Red, green and blue information percolation clusters. The dynamics can be viewed as a
deterministic function of $X_0$ and a random “update sequence” of the form $(J_1, U_1, t_1), (J_2, U_2, t_2), \ldots$,
where $0 < t_1 < t_2 < \cdots$ are the update times (the ringing of the Poisson clocks), the $J_i$'s are i.i.d.
uniformly chosen sites (whose clocks ring), and the $U_i$'s are i.i.d. uniform variables on $[0,1]$ (random coin tosses).
According to this representation, one processes the updates sequentially: set $t_0 = 0$; the configuration
$X_t$ for all $t \in [t_{i-1}, t_i)$ ($i \ge 1$) is obtained by updating the site $J_i$ via the unit variable $U_i$ as follows: letting
$\sigma = \sum_{v \sim J_i} X_{t_{i-1}}(v)$ denote the current sum-of-spins at the neighbors of $J_i$, if the coin toss satisfies

\[ U_i < \frac{e^{\beta\sigma}}{e^{\beta\sigma} + e^{-\beta\sigma}} = \tfrac12 \big(1 + \tanh(\beta\sigma)\big) \tag{2.1} \]

then the new spin at $J_i$ is chosen to be plus and otherwise it is set to minus.
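
The forward update rule (2.1) can be sketched in a few lines of code. This is a minimal illustration on the $n$-cycle, assuming one rate-1 clock per site (so the next ring arrives after an exponential time of rate $n$); the function names and the choice of the cycle are assumptions made for the example only.

```python
import math
import random

def heat_bath_step(spins, neighbors, beta, site, u):
    """Apply one heat-bath update at `site`, driven by the uniform variable u in [0, 1)."""
    sigma = sum(spins[v] for v in neighbors[site])      # sum of spins at the neighbors
    p_plus = 0.5 * (1.0 + math.tanh(beta * sigma))       # right-hand side of (2.1)
    spins[site] = 1 if u < p_plus else -1

def run_glauber(n, beta, t_end, rng):
    """Continuous-time Glauber dynamics on the n-cycle, started from all-plus, up to time t_end."""
    neighbors = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
    spins = [1] * n
    t = rng.expovariate(n)                # superposition of n rate-1 Poisson clocks
    while t < t_end:
        heat_bath_step(spins, neighbors, beta, rng.randrange(n), rng.random())
        t += rng.expovariate(n)
    return spins

print(run_glauber(10, 0.2, 5.0, random.Random(1)))
```

Passing the uniform variable explicitly mirrors the description of the dynamics as a deterministic function of the initial state and the update sequence.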

Equivalently, one may evaluate this deterministic function backward in time rather than forward:
sort the same update sequence $\{(J_i, U_i, t_i)\}$ such that $t_\star > t_1 > t_2 > \cdots$, where $t_\star$ is a designated time
at which we wish to analyze the distribution (and argue it is either close or far from equilibrium).
To construct $X_{t_\star}$, we again process the updates sequentially, now setting $t_0 = t_\star$ and determining $X_t$
for all $t \in [t_{i+1}, t_i)$ in step $i$, where the value of the spins at the neighbors of $J_i$ (determining $\sigma$ and the
probability of $\pm1$ in the update as above) is evaluated recursively via the suffix of the update sequence.

Examining (2.1), this backward simulation of the dynamics can be made more efficient, since even if
all the neighbors of the site that is being updated are (say) minus, there is still a positive probability
of $\frac12(1 - \tanh(\beta d))$ for a plus update: We may therefore decide to first examine $U_i$, and if $U_i < \theta$ for

\[ \theta = \theta_{\beta,d} := 1 - \tanh(\beta d) \tag{2.2} \]

we will set the new spin to plus/minus as a fair coin toss irrespective of $\sigma$ (namely, to plus iff $U_i < \theta/2$);
otherwise, we will recursively compute the spins at the neighbors of $J_i$, and set the new spin to plus iff

\[ 0 \le U_i - \theta < \tfrac12 \big(1 + \tanh(\beta\sigma) - \theta\big) = \tfrac12 \big(\tanh(\beta\sigma) + \tanh(\beta d)\big) . \]

(The right-hand side is $0$, $1 - \theta$ for the extreme $\sigma = \mp d$, while for other values of $\sigma$ the rule depends on $U_i$.)
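
As a sanity check, the two-stage rule (oblivious with probability $\theta$, otherwise a threshold depending on $\sigma$) should reproduce the plus-probability $\frac12(1+\tanh(\beta\sigma))$ of (2.1). The sketch below verifies this; the function names and the parameter values in the usage lines are illustrative assumptions.

```python
import math

def theta(beta, d):
    """Probability of an oblivious update, as in (2.2)."""
    return 1.0 - math.tanh(beta * d)

def new_spin_decomposed(u, sigma, beta, d):
    """New spin under the two-stage rule, driven by a uniform variable u in [0, 1)."""
    th = theta(beta, d)
    if u < th:                                   # oblivious update: fair coin, ignores sigma
        return 1 if u < th / 2 else -1
    threshold = 0.5 * (math.tanh(beta * sigma) + math.tanh(beta * d))
    return 1 if (u - th) < threshold else -1     # non-oblivious update: depends on sigma

def plus_probability_decomposed(sigma, beta, d):
    """Total length of the set of u-values that produce a plus spin."""
    return theta(beta, d) / 2 + 0.5 * (math.tanh(beta * sigma) + math.tanh(beta * d))

beta, d = 0.3, 4
for sigma in range(-d, d + 1, 2):
    print(sigma, plus_probability_decomposed(sigma, beta, d),
          0.5 * (1.0 + math.tanh(beta * sigma)))
```

The two printed columns agree because $\theta/2 + \frac12\tanh(\beta d) = \frac12$.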

We have arrived at a branching process in the space-time slab: to recover $X_{t_\star}(v)$ we track its lineage
backward in time, beginning with the temporal edge in the space-time slab $V \times [0, t_\star]$ between $(v, t_\star)$
and $(v, t_i \vee 0)$, where $t_i$ is the time of the latest update to $v$. If no such update was encountered
and this “branch” has survived to time $t = 0$, it assumes the value of the initial configuration $X_0(v)$.
Alternatively, an update at time $t_i$ has two possible consequences: if it features $U_i < \theta$ (an oblivious
update) the branch is terminated, as the spin can be recovered via a fair coin toss using $U_i$; otherwise,
we branch out to the neighbors of $v$, adding spatial edges between $(v, t_i)$ and $(u, t_i)$ for all $u \sim v$, and
continue developing the update histories of each of them until the process dies out or reaches time 0.
This produces a graph in the space-time slab which we denote by $\mathscr{H}_v$, and we let $\mathscr{H}_v(t)$ be its intersection
with the slab $V \times \{t\}$ (viewed as a subset of $V$); further let $\mathscr{H}_A = \bigcup_{v \in A} \mathscr{H}_v$ and set $\mathscr{H}_A(t)$ analogously.

The information percolation clusters are the connected components of the graph consisting of the histories $\{\mathscr{H}_v : v \in V\}$.

By definition, $X_{t_\star}(A)$ is a deterministic function of the $U_i$'s corresponding to points in $\mathscr{H}_A$ and
of $X_0(\mathscr{H}_A(0))$, the initial values at the intersection of $\mathscr{H}_A$ with the slab $V \times \{0\}$; in particular, if
$\mathscr{H}_V(0) = \emptyset$ then $X_{t_\star}$ is independent of $X_0$, and therefore its law is precisely the Ising distribution $\pi$
(for instance, in that scenario we could have taken $X_0 \sim \pi$ and then $X_{t_\star} \sim \pi$ by the invariance of $\pi$).
However, we note that waiting for a time $t_\star$ large enough such that $\mathscr{H}_V$ would be guaranteed not to
survive to time $t = 0$ with high probability is overkill; the correct mixing time is the point at which
$|\mathscr{H}_V(0)| \asymp \sqrt{n}$, whence the effect of $X_0$ on $X_{t_\star}$ would be absorbed in the normal fluctuations of $\pi$.

Remark. The above defined rule for developing the update histories either terminated a lineage or 
branched it to its d neighbors. In different applications of the method, it is crucial to appropriately 
select other rules with the correct marginal of the heat-bath dynamics. 

For instance, in [11] we prove results à la Theorem 1 on any graph (including, e.g., expanders) and
any $\beta < \kappa/d$ for an absolute constant $\kappa > 0$ (the correct dependence on $d$ up to the value of $\kappa$) by
selecting $k \in \{0, 1, \ldots, d\}$ with probability $p_k$ and then deciding the new spin via a function of a uniformly
chosen $k$-element subset of the neighbors, with the probabilities $p_k$ (and the corresponding functions to
be applied) following from a discrete Fourier expansion of the original Glauber dynamics update rule.
Furthermore, in [10], to analyze $\mathbb{Z}^d$ arbitrarily close to $\beta_c$, instead of describing such a rule explicitly
one relies on the existence of an efficient rule thanks to the exponential decay of spin-spin correlations.

Example. In the Ising model on $\mathbb{Z}_n$, with probability $\theta$ we assign a uniform $\pm1$ spin, and with
probability $1 - \theta$ we expose $\sigma \in \{0, \pm2\}$ and select the consensus spin in case $\sigma = \pm2$ or a uniform
$\pm1$ spin in case $\sigma = 0$. Hence, an equivalent rule would be to terminate the branch with probability
$\theta$ and otherwise to select a uniform neighbor and copy its spin, so $\mathscr{H}_v$ is merely a continuous-time
simple random walk killed at rate $\theta$, and $\mathscr{H}_V$ consists of $n$ coalescing random walks killed at rate $\theta$.
The probability that $\mathscr{H}_v(0) \ne \emptyset$ (survival from time $t_\star$ to time $t = 0$) is then $\exp(-\theta t_\star)$, and we would
have $\mathbb{E}|\mathscr{H}_V(0)| \asymp \sqrt{n}$ at $t_\star = (2\theta)^{-1} \log n + O(1)$, which is exactly the cutoff location.
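
The arithmetic behind this example is short enough to spell out. The sketch below ignores coalescence (an assumption: it treats the expected number of surviving walks as $n e^{-\theta t}$, which is an upper bound for the coalescing system) and solves $n e^{-\theta t} = \sqrt{n}$; the function names are hypothetical.

```python
import math

def theta_cycle(beta):
    """Oblivious-update probability (2.2) on the cycle, where d = 2."""
    return 1.0 - math.tanh(2.0 * beta)

def survival_probability(beta, t_star):
    """Probability that a single walk killed at rate theta survives for time t_star."""
    return math.exp(-theta_cycle(beta) * t_star)

def heuristic_cutoff_time(beta, n):
    """Time t at which n * exp(-theta * t) equals sqrt(n), i.e. (2 theta)^{-1} log n."""
    return math.log(n) / (2.0 * theta_cycle(beta))

n, beta = 10**6, 0.1
t = heuristic_cutoff_time(beta, n)
print(t, n * survival_probability(beta, t), math.sqrt(n))
```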

The key to the analysis is to classify the information percolation clusters into three types, where one
of these classes will be revealed (conditioned on), and the other two will represent competing factors,
which balance exactly at the correct point of mixing. In the basic case, the classification is as follows:

• A cluster $\mathcal{C} = \mathscr{H}_A$ is Red if, given the update sequence, its final state $X_{t_\star}(A)$ is a nontrivial
function of the initial configuration $X_0$ (in particular, it survives to time $t = 0$, i.e., $\mathscr{H}_A(0) \ne \emptyset$).

• A cluster $\mathcal{C} = \mathscr{H}_A$ is Blue if $A$ is a singleton ($A = \{v\}$ for some $v \in V$) and its history does
not survive to time zero (in particular, $X_{t_\star}(A)$ does not depend on $X_0$).

• Every other cluster $\mathcal{C}$ is Green.

Let $V_{\mathrm{Red}} = \{v : \mathscr{H}_v \subset \mathcal{C} \in \mathrm{Red}\}$ be the set of all vertices whose histories belong to red clusters, and let
$\mathscr{H}_{\mathrm{Red}} = \mathscr{H}_{V_{\mathrm{Red}}}$ be their collective history (similarly for blue/green). By a slight abuse of notation, we
write $A \in \mathrm{Red}$ to denote that $\mathscr{H}_A \in \mathrm{Red}$ (similarly for blue/green), yet notice the distinction between
$A \in \mathrm{Red}$ and $A \subset V_{\mathrm{Red}}$ (the former means that $\mathscr{H}_A$ is a full red cluster, rather than covered by ones).

Remark. Various other classifications into red/green/blue can be used so that Red captures the entire
dependence on the initial configuration and Blue forms a product-measure. For instance, in order to
study the effect of initial conditions in [11] we let a cluster be red if at least two branches of it survive
to time $t = 0$ and coalesce upon continuing to develop their history along $t \in (-\infty, 0]$; and in order to
carry the analysis of [10] up to $\beta_c$, the classification is more delicate, and involves the histories along a
burn-in phase near time $t_\star$ which allows one to amplify the subcritical nature of the clusters.

Note that if $\{v\} \in \mathrm{Blue}$ then, by definition and symmetry, the distribution of $X_{t_\star}(v)$ is uniform $\pm1$.
On the other hand, while the spins of a green cluster $\mathcal{C}$ are also independent of $X_0$, its spin-set at time
$t_\star$ can have a highly nontrivial distribution due to the dependencies between the intersecting update
histories. It is these green clusters that embody the complicated structure of the Ising measure.

Example. In the Ising model on $\mathbb{Z}_n$, as explained above, an information percolation cluster corresponds
to a maximal collection of random walks that coalesced. A cluster is red if it survived to time 0 (the
rule of copying the spin at the location of the walk guarantees a nontrivial dependency on $X_0$); it is
blue if the random walk started at $v$ dies out before coalescing with any other walk and before reaching
time $t = 0$; and it is green otherwise. Observe that the sites of a green cluster at time $t_\star$ all have the
same spin (a uniform $\pm1$ spin, independent of $X_0$), and the probability of $u, v$ belonging to the same
green cluster decays exponentially in $|u - v|$ (as the walks become more likely to die out than merge).
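
The coalescing-random-walk picture on the cycle lends itself to a short simulation. The following is a rough sketch under stated assumptions: each occupied site carries one walk, rings at rate 1, is killed with probability $\theta$ and otherwise jumps to a uniform neighbor, merging with any walk already there. The function name, the union-find bookkeeping and the returned dictionary of vertex counts are choices made for this illustration.

```python
import math
import random

def classify_cycle_clusters(n, beta, t_star, rng):
    """Count vertices of Z_n whose histories fall in red, blue and green clusters."""
    theta = 1.0 - math.tanh(2.0 * beta)       # (2.2) with d = 2
    parent = list(range(n))                    # union-find over the starting sites

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    walker_at = {v: v for v in range(n)}       # current site -> a starting site of that walk
    elapsed = 0.0                              # time run backward from t_star
    while walker_at:
        elapsed += rng.expovariate(len(walker_at))   # one rate-1 clock per occupied site
        if elapsed >= t_star:
            break                              # remaining walks reach time 0
        site = rng.choice(list(walker_at))
        label = walker_at.pop(site)
        if rng.random() < theta:
            continue                           # oblivious update: this walk is killed
        target = (site + rng.choice((-1, 1))) % n
        if target in walker_at:
            parent[find(label)] = find(walker_at[target])   # coalescence of two walks
        else:
            walker_at[target] = label

    surviving = {find(label) for label in walker_at.values()}
    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    counts = {"red": 0, "blue": 0, "green": 0}
    for root, size in sizes.items():
        if root in surviving:
            counts["red"] += size              # cluster survives to time 0
        elif size == 1:
            counts["blue"] += size             # singleton that died out
        else:
            counts["green"] += size            # merged cluster that died out
    return counts

print(classify_cycle_clusters(200, 0.1, 3.0, random.Random(0)))
```

Re-running with larger `t_star` shifts mass from red to blue and green, in line with the contest between the classes described in the text.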

As the green clusters demonstrate the features of the Ising measure, it is tempting to analyze them
in order to understand $X_{t_\star}$. The approach we will follow does the opposite: we will condition on the
entire set of histories $\mathscr{H}_{V_{\mathrm{Green}}}$ (or $\mathscr{H}_{\mathrm{Green}}$ for brevity), and study the remaining (red and blue) clusters.

As we hinted at when saying above that if $\mathscr{H}_V(0) = \emptyset$ then $X_{t_\star} \sim \pi$, the Ising measure $\pi$ can be
perfectly simulated via developing the histories backwards in time until every branch terminates without
any special exception at time $t = 0$. (This would be equivalent to taking $t_\star$ larger and larger until $\mathscr{H}_V$
would be guaranteed not to survive, essentially as in the ingenious Coupling From The Past method of
perfect simulation due to Propp and Wilson [14].) Thus, we can couple $X_{t_\star}$ to $\pi$ via the same update
sequence, in which case the distribution of $V_{\mathrm{Green}}$, albeit complicated, is identical (green clusters
never reach $t = 0$), allowing us to only consider blue/red clusters (however in a difficult conditional
space where percolation clusters are forbidden from touching various parts of the space-time slab).

The second key is to keep the blue clusters (which could have been coupled to $\pi$ just like the green
ones, as they too do not survive to time $t = 0$) in order to water down the effect of the red clusters
(which, by themselves, are starkly different under $X_{t_\star}$ and $\pi$). We will show that, conditioned on
$\mathscr{H}_{\mathrm{Green}}$, the measures $\pi$ and $X_{t_\star}$ are both very close to the uniform measure (and therefore to each
other), i.e., essentially as if there were no red clusters at all and every $v \in V \setminus V_{\mathrm{Green}}$ belonged to
$V_{\mathrm{Blue}}$. Indeed, conditioning on the green clusters replaces the Ising measure by a contest between blue
and red clusters: at large $t_\star$, the effect of Red is negligible and we get roughly the uniform measure; at
small $t_\star$, the effect of Red dominates and $X_0$ will have a noticeable effect on $X_{t_\star}$; at the balancing point
$t_{\mathfrak m}$ the Red clusters do affect $X_{t_\star}$, but just within the normal fluctuations of $\pi$.

Showing that the effect of the red clusters is negligible just beyond the cutoff location will be achieved
via a clever lemma of Miller and Peres [13] that bounds the $L^2$-distance of a measure from the uniform
measure in terms of a certain exponential moment. In our setting, this reduces the problem to showing:

\[ \mathbb{E}\big[ 2^{|V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}|} \,\big|\, \mathscr{H}_{\mathrm{Green}} \big] \to 1 \quad \text{in probability as } n \to \infty , \tag{2.3} \]

where $V_{\mathrm{Red}}$ and $V_{\mathrm{Red}'}$ are independent instances of the vertices whose histories are part of red clusters.


Example. Recall the coalescing random walks representation for the information percolation clusters
of the Ising model on $\mathbb{Z}_n$, and suppose we wanted to estimate $\mathbb{E}[2^{|V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}|}]$, i.e., the left-hand side of (2.3)
without the complicated conditioning on $\mathscr{H}_{\mathrm{Green}}$. Then $v \in V_{\mathrm{Red}}$ iff its random walk survives to time
$t = 0$, which has probability $e^{-\theta t_\star}$. By the independence, $\mathbb{P}(v \in V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}) = e^{-2\theta t_\star}$. If the sites were
independent (they are not of course, but the intuition is still correct), then $\mathbb{E}[2^{|V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}|}]$ would break
into $\prod_v \mathbb{E}\big[1 + \mathbf{1}_{\{v \in V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}\}}\big] \le \exp(n e^{-2\theta t_\star})$, which for $t_\star = (2\theta)^{-1} \log n + C$ is at most $\exp(e^{-2\theta C})$
that approaches 1 as we increase the constant $C > 0$ in $t_\star$. (The actual calculation of the exponential
moment given $\mathscr{H}_{\mathrm{Green}}$, especially at very low temperatures, requires quite a bit more care.)
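
The heuristic bound in this example is easy to evaluate numerically. The sketch below works under the same simplifying independence assumption as the example (which the text notes is not literally true); the function name and the sample parameter values are illustrative.

```python
import math

def heuristic_exponential_moment_bound(n, theta, c):
    """Return ((1 + p)^n, exp(n p)) for p = exp(-2 theta t_star), t_star = log(n)/(2 theta) + c."""
    t_star = math.log(n) / (2.0 * theta) + c
    p_both = math.exp(-2.0 * theta * t_star)     # P(v in both independent red sets)
    product_form = (1.0 + p_both) ** n           # product over sites under independence
    bound = math.exp(n * p_both)                 # upper bound via 1 + x <= e^x
    return product_form, bound

for c in (0.0, 1.0, 3.0, 6.0):
    print(c, heuristic_exponential_moment_bound(10**4, 0.8, c))
```

Both quantities decrease toward 1 as `c` grows, matching the claim that the bound $\exp(e^{-2\theta C})$ approaches 1.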


The key to obtaining the bound on the exponential moment in (2.3), which is the crux of the proof,
is estimating a conditional probability that $A \in \mathrm{Red}$, in which we condition not only on $\mathscr{H}_{\mathrm{Green}}$ but on
the entire collective histories of every vertex outside of $A$, and that $A$ itself is either the full intersection
of a red cluster with the top slab $V \times \{t_\star\}$ or a collection of blue singletons. Formally, let

\[ \Psi_A = \sup_{\mathscr{H}_A^-} \mathbb{P}\big( A \in \mathrm{Red} \,\big|\, \mathscr{H}_A^- ,\, \{A \in \mathrm{Red}\} \cup \{A \subset V_{\mathrm{Blue}}\} \big) , \tag{2.4} \]

where

\[ \mathscr{H}_A^- = \{ \mathscr{H}_v(t) : v \notin A ,\, t \le t_\star \} , \]

noting that, towards estimating the probability of $A \in \mathrm{Red}$, the effect of conditioning on $\mathscr{H}_A^-$ amounts
to requiring that $\mathscr{H}_A$ must not intersect $\mathscr{H}_A^-$.

Lemma 2.1. For any $d \ge 2$ and $\lambda > 0$ there exist $\beta_0, C_0 > 0$ such that if $\beta < \beta_0$ then for any $A \subset V$
and large enough $n$, the conditional probability that $A \in \mathrm{Red}$ at time $t_\star$ satisfies

\[ \Psi_A \le C_0\, \mathfrak{m}_{t_\star} e^{-\lambda \mathfrak{W}(A)} , \]

where $\mathfrak{W}(A)$ is the size of the smallest connected subgraph containing $A$.

For intuition, recall that $A \in \mathrm{Red}$ if the histories $\{\mathscr{H}_v : v \in A\}$ are all connected and $X_{t_\star}(A)$ is a
nontrivial function of $X_0$ (in particular, $\mathscr{H}_A(0) \ne \emptyset$). This is closely related to the probability that for
a single $v \in A$ we have that $X_{t_\star}(v)$ is a nontrivial function of $X_0$ (i.e., $\mathscr{H}_v$ survives to time $t = 0$ and
creates a nontrivial dependence on $X_0$, whence the connected component of $\mathscr{H}_v$ is a red cluster).

Indeed, the probability of the latter event is exactly

\[ \mathfrak{m}_{t_\star} = \mathbb{E} X_{t_\star}^+(v) = \mathbb{P}\big( X_{t_\star}^+(v) \ne X_{t_\star}^-(v) \big) , \tag{2.5} \]

which explains the term $\mathfrak{m}_{t_\star}$ in Lemma 2.1. The extra term $\exp(-\lambda \mathfrak{W}(A))$ is due to the requirement
that the histories of $A$ must spatially connect, thus the projection of the cluster on $V$ is a connected
subgraph containing $A$ (whose size is at least $\mathfrak{W}(A)$ by definition).


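2.2. Upper bound modulo Lemma 2.1. Our goal is to show that $d_{\mathrm{tv}}(t_\star) < \varepsilon$ for $t_\star = t_{\mathfrak m} + s_\star$
with a suitably large $s_\star > 0$, where $d_{\mathrm{tv}}(t) = \max_{x_0} \| \mathbb{P}_{x_0}(X_t \in \cdot) - \pi \|_{\mathrm{tv}}$. By Jensen's inequality,
$\|\psi - \varphi\|_{\mathrm{tv}} \le \mathbb{E}\big[ \|\psi(\cdot \mid Z) - \varphi(\cdot \mid Z)\|_{\mathrm{tv}} \big]$ for any two distributions $\psi$ and $\varphi$ on a finite probability space
$\Omega$ and random variable $Z$. Applied with $\mathscr{H}_{\mathrm{Green}}$ playing the role of $Z$, and letting $X_0' \sim \pi$,

\[
\begin{aligned}
d_{\mathrm{tv}}(t_\star) &\le \max_{x_0} \mathbb{E}\Big[ \big\| \mathbb{P}_{x_0}(X_{t_\star} \in \cdot \mid \mathscr{H}_{\mathrm{Green}}) - \mathbb{P}_{X_0'}(X'_{t_\star} \in \cdot \mid \mathscr{H}_{\mathrm{Green}}) \big\|_{\mathrm{tv}} \Big] \\
&\le \sup_{\mathscr{H}_{\mathrm{Green}}} \max_{x_0} \big\| \mathbb{P}_{x_0}(X_{t_\star} \in \cdot \mid \mathscr{H}_{\mathrm{Green}}) - \mathbb{P}_{X_0'}(X'_{t_\star} \in \cdot \mid \mathscr{H}_{\mathrm{Green}}) \big\|_{\mathrm{tv}} .
\end{aligned}
\]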


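As explained above, since $X_{t_\star}(V_{\mathrm{Green}})$ is independent of $X_0$ we can couple it with the chain started at
$X_0' \sim \pi$, whence the projection onto $V \setminus V_{\mathrm{Green}}$ does not decrease the total-variation distance, and so

\[
\begin{aligned}
d_{\mathrm{tv}}(t_\star) &\le \sup_{\mathscr{H}_{\mathrm{Green}}} \max_{x_0} \big\| \mathbb{P}_{x_0}\big(X_{t_\star}(V \setminus V_{\mathrm{Green}}) \in \cdot \mid \mathscr{H}_{\mathrm{Green}}\big) - \mathbb{P}_{X_0'}\big(X'_{t_\star}(V \setminus V_{\mathrm{Green}}) \in \cdot \mid \mathscr{H}_{\mathrm{Green}}\big) \big\|_{\mathrm{tv}} \\
&\le 2 \sup_{\mathscr{H}_{\mathrm{Green}}} \max_{x_0} \big\| \mathbb{P}_{x_0}\big(X_{t_\star}(V \setminus V_{\mathrm{Green}}) \in \cdot \mid \mathscr{H}_{\mathrm{Green}}\big) - \nu_{V \setminus V_{\mathrm{Green}}} \big\|_{\mathrm{tv}} ,
\end{aligned}
\tag{2.6}
\]

where $\nu_A$ is the uniform measure on configurations on the sites in $A$.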

We now appeal to the aforementioned lemma of Miller and Peres [13] that shows that, if a measure $\mu$
on $\{\pm1\}^V$ is given by sampling a variable $R \subset V$ and using an arbitrary law for its spins and a product
of Bernoulli($\frac12$) for $V \setminus R$, then the squared $L^2$-distance of $\mu$ from the uniform measure is at most $\mathbb{E} 2^{|R \cap R'|} - 1$
for i.i.d. copies $R, R'$. (See [10, Lemma 4.3] for a generalization of this to a product of general measures
that is imperative for the information percolation analysis on $\mathbb{Z}^d$ at $\beta$ near $\beta_c$.)

Lemma 2.2 ([13]). Let $\Omega = \{\pm1\}^V$ for a finite set $V$. For each $R \subset V$, let $\varphi_R$ be a measure on $\{\pm1\}^R$.
Let $\nu$ be the uniform measure on $\Omega$, and let $\mu$ be the measure on $\Omega$ obtained by sampling a subset $R \subset V$
via some measure $\tilde\mu$, generating the spins of $R$ via $\varphi_R$, and finally sampling $V \setminus R$ uniformly. Then

\[ \|\mu - \nu\|^2_{L^2(\nu)} \le \mathbb{E}\, 2^{|R \cap R'|} - 1 \quad \text{where } R, R' \text{ are i.i.d. with law } \tilde\mu . \]
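
For a very small vertex set, the inequality of Lemma 2.2 can be checked by brute force. The sketch below draws an arbitrary law on subsets and arbitrary laws $\varphi_R$ at random, builds $\mu$ explicitly and compares both sides; the function name and the random construction are assumptions made for this illustration, and the check is of course no substitute for the proof in [13].

```python
import itertools
import random

def check_miller_peres(n, rng):
    """Return (squared L2(nu) distance of mu from uniform, E 2^{|R cap R'|} - 1) on n sites."""
    sites = range(n)
    subsets = [frozenset(s) for k in range(n + 1) for s in itertools.combinations(sites, k)]
    configs = list(itertools.product((-1, 1), repeat=n))

    def random_distribution(size):
        weights = [rng.random() + 1e-9 for _ in range(size)]
        total = sum(weights)
        return [w / total for w in weights]

    # Arbitrary law on subsets R, and an arbitrary law phi[R] on the spins of each R.
    subset_law = dict(zip(subsets, random_distribution(len(subsets))))
    phi = {}
    for R in subsets:
        keys = list(itertools.product((-1, 1), repeat=len(R)))
        phi[R] = dict(zip(keys, random_distribution(len(keys))))

    # mu(sigma): sample R, draw its spins from phi[R], fill the other sites uniformly.
    mu = {}
    for sigma in configs:
        total = 0.0
        for R in subsets:
            restricted = tuple(sigma[v] for v in sorted(R))
            total += subset_law[R] * phi[R][restricted] * 0.5 ** (n - len(R))
        mu[sigma] = total

    nu = 0.5 ** n
    lhs = sum(nu * (mu[s] / nu - 1.0) ** 2 for s in configs)
    rhs = sum(subset_law[R] * subset_law[S] * 2 ** len(R & S)
              for R in subsets for S in subsets) - 1.0
    return lhs, rhs

print(check_miller_peres(3, random.Random(0)))
```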


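Applying this lemma to the right-hand side of (2.6), while recalling that any two measures $\mu$ and $\nu$
on a finite probability space satisfy $\|\mu - \nu\|_{\mathrm{tv}} = \frac12 \|\mu - \nu\|_{L^1(\nu)} \le \frac12 \|\mu - \nu\|_{L^2(\nu)}$, we find that

\[ d_{\mathrm{tv}}(t_\star) \le \Big( \sup_{\mathscr{H}_{\mathrm{Green}}} \mathbb{E}\big[ 2^{|V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}|} \,\big|\, \mathscr{H}_{\mathrm{Green}} \big] - 1 \Big)^{1/2} , \tag{2.7} \]

where $V_{\mathrm{Red}}$ and $V_{\mathrm{Red}'}$ are i.i.d. copies of the variable $\bigcup \{ v \in V : \mathscr{H}_v \in \mathrm{Red} \}$. We will reduce the
quantity $|V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}|$ to one that involves the $\Psi_A$ variables defined in (2.4) (which will thereafter be
controlled via Lemma 2.1) using the next lemma, whose proof is deferred to §2.3 below.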


Lemma 2.3. Let $\{Y_A : A \subset V\}$ be a family of independent indicators satisfying $\mathbb{P}(Y_A = 1) = \Psi_A$. The
conditional distribution of $V_{\mathrm{Red}}$ given $\mathscr{H}_{\mathrm{Green}}$ can be coupled such that

\[ \{A : A \in \mathrm{Red}\} \subset \{A : Y_A = 1\} . \]


The following corollary is then straightforward (see §2.3 for its short proof). 

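Corollary 2.4. Let $\{Y_{A,A'} : A, A' \subset V\}$ be a family of independent indicators satisfying

\[ \mathbb{P}(Y_{A,A'} = 1) = \Psi_A \Psi_{A'} \quad \text{for any } A, A' \subset V . \tag{2.8} \]

The conditional distribution of $(V_{\mathrm{Red}}, V_{\mathrm{Red}'})$ given $\mathscr{H}_{\mathrm{Green}}$ can be coupled to the $Y_{A,A'}$'s such that

\[ |V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}| \preceq \sum_{A \cap A' \ne \emptyset} |A \cup A'| \, Y_{A,A'} . \]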


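Relaxing $|A \cup A'|$ into $|A| + |A'|$, we get

\[ \sup_{\mathscr{H}_{\mathrm{Green}}} \mathbb{E}\big[ 2^{|V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}|} \,\big|\, \mathscr{H}_{\mathrm{Green}} \big] \le \mathbb{E}\Big[ 2^{\sum_{A \cap A' \ne \emptyset} (|A| + |A'|) Y_{A,A'}} \Big] = \prod_{A \cap A' \ne \emptyset} \mathbb{E}\Big[ 2^{(|A| + |A'|) Y_{A,A'}} \Big] , \]

with the equality due to the independence of the $Y_{A,A'}$'s. By the definition of these indicators in (2.8),

\[ \prod_{A \cap A' \ne \emptyset} \mathbb{E}\Big[ 2^{(|A| + |A'|) Y_{A,A'}} \Big] \le \prod_{v} \prod_{\substack{A, A' \\ v \in A \cap A'}} \Big( \big(2^{|A| + |A'|} - 1\big) \Psi_A \Psi_{A'} + 1 \Big) \le e^{\, n \left( \sum_{A \ni v} 2^{|A|} \Psi_A \right)^2} , \]

and so, revisiting (2.7), we conclude that

\[ d_{\mathrm{tv}}(t_\star) \le \Big( e^{\, n \left( \sum_{A \ni v} 2^{|A|} \Psi_A \right)^2} - 1 \Big)^{1/2} \wedge 1 \le \sqrt{2n} \sum_{A \ni v} 2^{|A|} \Psi_A , \tag{2.9} \]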

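where we used that $e^x - 1 \le 2x$ for $x \in [0,1]$. Using Lemma 2.1 with $\lambda = \log(4ed)$ we find that

\[ \sum_{A \ni v} 2^{|A|} \Psi_A \le C_0\, \mathfrak{m}_{t_\star} \sum_k \sum_{\substack{A \ni v \\ \mathfrak{W}(A) = k}} 2^k e^{-\lambda k} \le C_0\, \mathfrak{m}_{t_\star} \sum_k \big( 2ed\, e^{-\lambda} \big)^k \le 2 C_0\, \mathfrak{m}_{t_\star} , \]

for some $C_0 = C_0(d) > 0$, and going back to (2.9) shows that $d_{\mathrm{tv}}(t_\star) \le 2\sqrt{2}\, C_0 \sqrt{n}\, \mathfrak{m}_{t_\star}$.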

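The upper bound is concluded by the submultiplicativity of the magnetization (see [11, Claim 3.3]):

\[ e^{-t}\, \mathfrak{m}_{t_0} \le \mathfrak{m}_{t_0 + t} \le e^{-(1 - \beta d) t}\, \mathfrak{m}_{t_0} \quad \text{for any } t_0, t \ge 0 , \tag{2.10} \]

since $\mathfrak{m}_{t_{\mathfrak m}} = 1/\sqrt{n}$ (recall (1.2)) and so taking $t_\star = t_{\mathfrak m} + C$ for a suitable $C(\varepsilon) > 0$ yields $d_{\mathrm{tv}}(t_\star) < \varepsilon$. ■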
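
To make the choice of $C(\varepsilon)$ concrete: combining the bound $d_{\mathrm{tv}}(t_\star) \le 2\sqrt{2}\, C_0 \sqrt{n}\, \mathfrak{m}_{t_\star}$ with (2.10) and $\mathfrak{m}_{t_{\mathfrak m}} = 1/\sqrt{n}$ gives $d_{\mathrm{tv}}(t_{\mathfrak m} + C) \le 2\sqrt{2}\, C_0 e^{-(1-\beta d) C}$. The sketch below solves this for $C$; the constant $C_0$ is not explicit in the text, so the value passed in is a hypothetical placeholder, as are the function names.

```python
import math

def upper_bound_on_distance(shift, c0, beta, d):
    """Bound 2*sqrt(2)*c0*exp(-(1 - beta*d)*shift) on the distance at time t_m + shift."""
    return 2.0 * math.sqrt(2.0) * c0 * math.exp(-(1.0 - beta * d) * shift)

def window_constant(eps, c0, beta, d):
    """Smallest shift C >= 0 at which the bound above is at most eps (needs beta*d < 1)."""
    rate = 1.0 - beta * d
    if rate <= 0.0:
        raise ValueError("the bound only decays when beta * d < 1")
    return max(0.0, math.log(2.0 * math.sqrt(2.0) * c0 / eps) / rate)

# c0 = 5.0 is a made-up stand-in for the unspecified constant C_0(d).
c = window_constant(0.1, 5.0, 0.05, 4)
print(c, upper_bound_on_distance(c, 5.0, 0.05, 4))
```

Note that the resulting shift depends on $\varepsilon$, $\beta$ and $d$ but not on $n$, which is what an $O(1)$-window means.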

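2.3. Proof of Lemma 2.3. We claim that given $\mathscr{H}_{\mathrm{Green}}$, if one arbitrarily orders all distinct subsets
$A \subset V \setminus V_{\mathrm{Green}}$ as $\{A_i\}_{i \ge 1}$ and then successively exposes whether $\{A_i \in \mathrm{Red}\}$, denoting the associated
filtration by $\mathcal{F}_i$, then $\mathbb{P}(A_i \in \mathrm{Red} \mid \mathcal{F}_{i-1}) \le \Psi_{A_i}$. To see this, first increase $\mathbb{P}(A_i \in \mathrm{Red} \mid \mathcal{F}_{i-1})$ into
$\mathbb{P}(A_i \in \mathrm{Red} \mid \{A_i \in \mathrm{Red}\} \cup \{A_i \subset V_{\mathrm{Blue}}\} ,\, \mathcal{F}_{i-1})$, and then further condition on a worst-case $\mathscr{H}_{A_i}^-$. The
latter subsumes any information in $\mathbf{1}_{\{A_j \in \mathrm{Red}\}}$ for $A_j$ disjoint from $A_i$; the former means that $A_i$ is either
the full intersection of a red cluster with $V \times \{t_\star\}$ or a collection of blue singletons, either way implying
that any $A_j$ intersecting $A_i$ must satisfy $A_j \notin \mathrm{Red}$. Altogether, the events $\{A_j \in \mathrm{Red} : j < i\}$ are
measurable under the conditioning on $\{A_i \in \mathrm{Red}\} \cup \{A_i \subset V_{\mathrm{Blue}}\}$ and $\mathscr{H}_{A_i}^-$, and we arrive at $\Psi_{A_i}$, which
immediately implies the stochastic domination. ■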

Proof of Corollary 2.4. By Lemma 2.3 we can stochastically dominate $\{A \in \mathrm{Red}\}$ and $\{A' \in \mathrm{Red}'\}$
by two independent sets of indicators $\{Y_A\}$ and $\{Y'_{A'}\}$. Let $\{(A_l, A'_l)\}_{l \ge 1}$ denote all pairs of intersecting
subsets ($A, A' \subset V \setminus V_{\mathrm{Green}}$ with $A \cap A' \ne \emptyset$) arbitrarily ordered, and associate each pair with a variable
$R_l$ initially set to 0. Process these in order: If we have not yet found $R_j = 1$ for some $j < l$ with either
$A_j \cap A_l \ne \emptyset$ or $A'_j \cap A'_l \ne \emptyset$, then set $R_l = \mathbf{1}_{\{A_l \in \mathrm{Red},\, A'_l \in \mathrm{Red}'\}}$ (otherwise skip this pair, keeping $R_l = 0$).

Noting $\mathbb{P}(R_l = 1 \mid \mathcal{F}_{l-1}) \le \mathbb{P}(Y_{A_l} = 1 ,\, Y'_{A'_l} = 1) = \Psi_{A_l} \Psi_{A'_l}$ (as testing $R_l = 1$ means we received only
negative information on $\{Y_{A_l} = 1\}$ and $\{Y'_{A'_l} = 1\}$) gives the coupling $\{l : R_l = 1\} \subset \{l : Y_{A_l, A'_l} = 1\}$.
Completing the proof is the fact that if $v \in V_{\mathrm{Red}} \cap V_{\mathrm{Red}'}$ then there are subsets $A_l, A'_l$ containing it
such that $Y_{A_l} = 1$ and $Y'_{A'_l} = 1$, in which case either $R_l = 1$ or we will have an earlier $R_j = 1$ for a pair involving
$A_j = A_l$ or $A'_j = A'_l$ (nontrivial intersections with $A_l$ or $A'_l$ will not be red), whence $v \in A_j \cup A'_j$. ■

2.4. Lower Bound. Recalling the choice of $t_{\mathfrak m}$ as the point such that $\mathfrak{m}_{t_{\mathfrak m}} = 1/\sqrt{n}$, we let the sum of
spins $f(\sigma) = \sum_v \sigma(v)$ be our distinguishing statistic at time $t_\star^- = t_{\mathfrak m} - s_\star$. Putting $Y = f(X^+_{t_\star^-})$ for the
dynamics $(X_t^+)$ started from all-plus, by (2.10) we have

\[ \mathbb{E} Y = n\, \mathfrak{m}_{t_\star^-} \ge n\, e^{(1 - \beta d) s_\star}\, \mathfrak{m}_{t_{\mathfrak m}} \ge e^{s_\star/2} \sqrt{n} \tag{2.11} \]

(the last inequality using $\beta d \le \frac12$). For the variance estimate we use the fact (see [11, Claim 3.4]) that
for some constants $\beta_0 = \beta_0(d) > 0$ and $\gamma = \gamma(d) > 0$, if $\beta < \beta_0$ then

\[ \sum_u \mathrm{Cov}\big(X_t(u), X_t(v)\big) \le \gamma \quad \text{for any } X_0,\ t > 0 \text{ and } v \in V . \tag{2.12} \]

From this inequality it follows that $\mathrm{Var}(Y) \le \gamma n$, and in light of (2.11), $\mathbb{P}(Y > \mathbb{E}Y/2) \ge 1 - \varepsilon/2$ by
Chebyshev's inequality provided $s_\star = s_\star(\varepsilon)$ is chosen large enough.

On the other hand, if $X' \sim \pi$ then $\mathbb{E}[f(X')] = 0$ (as $\mathbb{E}[X'(v)] = 0$ for any $v$), while $\mathrm{Var}(f(X')) \le \gamma n$
by the decay of correlation of the Ising measure. By Chebyshev's inequality, for any large enough
$s_\star = s_\star(\varepsilon)$ we have $\mathbb{P}(f(X') > \mathbb{E}Y/2) \le \varepsilon/2$. Altogether, $d_{\mathrm{tv}}(t_\star^-) \ge 1 - \varepsilon$, as required. ■




3. Analysis of the information percolation clusters 

The delicate part of bounding $\Psi_A$ is of course the conditioning: since $A$ is either the interface
of a complete red cluster, or a collection of blue singletons, its combined histories must avoid $\mathscr{H}_A^-$.
This immediately implies, for instance, that if any branches of $\mathscr{H}_A^-$ should extend to some point
$(v, t) \in A \times [0, t_\star)$ in the space-time slab, necessarily the branch of $v$ must receive an update along the
interval $(t, t_\star]$ to facilitate avoiding that branch. Our concern will be such a scenario with $t \in (t_\star - 1, t_\star]$,
since for values of $t$ extremely close to $t_\star$ this event might be extremely unlikely (potentially having
probability smaller than $\exp[-O(|A|)]$, which we would not be able to afford). For a subset $A' \subset A$
and a sequence of times $\{s_u\}_{u \in A'}$ with $s_u \in (t_\star - 1, t_\star]$, define

\[ \mathcal{U} = \mathcal{U}(A', \{s_u\}_{u \in A'}) = \bigcap_{u \in A'} \{ u \text{ receives an update along } (s_u, t_\star] \} . \tag{3.1} \]

We will reduce the conditioning on $\mathscr{H}_A^-$ to an appropriate event $\mathcal{U}$, and thereafter we would want to
control $\mathscr{H}_A$, the collection of all histories from $A$, both spatially, measured by its branching edges

\[ \chi(\mathscr{H}_A) = \#\big\{ ((u,t),(v,t)) \in \mathscr{H}_A : uv \in E(G) ,\ t \in (0, t_\star] \big\} \]

(i.e., those edges that correspond to a site branching out to its neighbors via a non-oblivious update),
and temporally, as measured by the following length quantity:

\[ \mathfrak{L}(\mathscr{H}_A) = \sum_{u \in V} \int_0^{t_\star} \mathbf{1}_{\{(u,t) \in \mathscr{H}_A\}} \, dt . \tag{3.2} \]

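The following lemma bounds an exponential moment of $\chi(\mathscr{H}_A)$ and $\mathfrak{L}(\mathscr{H}_A)$ under any conditioning on
an event $\mathcal{U}$ as above, and will be central to the proof of Lemma 2.1.

Lemma 3.1. For any $d \ge 2$, $0 < \eta < 1$ and $\lambda > 0$ there are $\beta_0(d, \eta, \lambda) > 0$ and $\alpha(d, \eta) > 0$ such that
the following holds. If $\beta < \beta_0$ then for any subset $A$,

\[ \sup_{\mathcal{U}} \mathbb{E}\big[ \exp\big( \eta\, \mathfrak{L}(\mathscr{H}_A) + \lambda\, \chi(\mathscr{H}_A) \big) \,\big|\, \mathcal{U} \big] \le \exp(\alpha |A|) , \]

where the event $\mathcal{U}$ is as defined in (3.1).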

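3.1. Proof of Lemma 2.1. For a given subset $S \subset V$, let $\mathrm{Red}^*_S$ denote the red clusters that arise when
exposing the joint histories of $\mathscr{H}_S$ (as opposed to all the histories $\mathscr{H}_V$), noting the events $\{A \in \mathrm{Red}\}$
and $\{A \in \mathrm{Red}^*_A\} \cap \{\mathscr{H}_A \cap \mathscr{H}_A^- = \emptyset\}$ are identical (so that $A$ would be the interface of a full red cluster).
Similarly define $\mathrm{Blue}^*_S$, and by the same reasoning $\{A \subset V_{\mathrm{Blue}}\} = \{A \subset V_{\mathrm{Blue}^*_A}\} \cap \{\mathscr{H}_A \cap \mathscr{H}_A^- = \emptyset\}$.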

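Next, given $\mathscr{H}_A^- = \mathcal{X}$, let $s_u = s_u(\mathcal{X}) = \max\{s : (u, s) \in \mathcal{X}\}$ be the latest time at which
$\mathcal{X}$ contains $u \in A$, and recall from the discussion above that any $u$ with $s_u < t_\star$ must receive an
update along $(s_u, t_\star]$ in order to avoid $\mathcal{X}$. Thus, writing $A' = \{u \in A : s_u > t_\star - 1\}$ and defining
$\mathcal{U}(A', \{s_u\}_{u \in A'})$ as in (3.1), we find $\mathbb{P}(A \in \mathrm{Red} \mid \mathscr{H}_A^- = \mathcal{X} ,\, \{A \in \mathrm{Red}\} \cup \{A \subset V_{\mathrm{Blue}}\})$ to be equal to

\[ \frac{ \mathbb{P}\big( A \in \mathrm{Red}^*_A ,\ \mathscr{H}_A \cap \mathcal{X} = \emptyset ,\ \mathcal{U} \,\big|\, \mathscr{H}_A^- = \mathcal{X} \big) }{ \mathbb{P}\big( \{A \in \mathrm{Red}^*_A\} \cup \{A \subset V_{\mathrm{Blue}^*_A}\} ,\ \mathscr{H}_A \cap \mathcal{X} = \emptyset ,\ \mathcal{U} \,\big|\, \mathscr{H}_A^- = \mathcal{X} \big) } , \]

which, since both of the events $A \in \mathrm{Red}^*_A$ and $A \subset V_{\mathrm{Blue}^*_A}$ are $\mathscr{H}_A$-measurable, equals

\[ \frac{ \mathbb{P}\big( A \in \mathrm{Red}^*_A ,\ \mathscr{H}_A \cap \mathcal{X} = \emptyset \,\big|\, \mathcal{U} \big) }{ \mathbb{P}\big( \{A \in \mathrm{Red}^*_A\} \cup \{A \subset V_{\mathrm{Blue}^*_A}\} ,\ \mathscr{H}_A \cap \mathcal{X} = \emptyset \,\big|\, \mathcal{U} \big) } . \]

The numerator is at most $\mathbb{P}(A \in \mathrm{Red}^*_A \mid \mathcal{U})$. As for the denominator, it is at least the probability that,
in the space conditioned on $\mathcal{U}$, every $u \in A$ gets updated in the interval $(s_u \vee (t_\star - 1), t_\star]$ and the last
such update (i.e., the first we expose when revealing $\mathscr{H}_u$) is oblivious (implying $A \subset V_{\mathrm{Blue}^*_A}$), which
is $\theta^{|A|} (1 - 1/e)^{|A \setminus A'|}$. As this is at least $e^{-|A|}$ for small enough $\beta$ (recall the definition of $\theta$ in (2.2)),

\[ \Psi_A \le e^{|A|} \sup_{\mathcal{U}} \mathbb{P}\big( A \in \mathrm{Red}^*_A \,\big|\, \mathcal{U} \big) . \tag{3.3} \]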


To estimate $\mathbb{P}(A \in \mathrm{Red}^*_A \mid \mathcal{U})$, let us expose the history of $A$ backwards from $t_\star$ along the first unit
interval $(t_\star - 1, t_\star]$ (moving us past the information embedded in $\mathcal{U}$), then further on until reaching time
$T$ where $\mathscr{H}_A(T)$ coalesces to a single point or we reach time $T = 0$. For $A \in \mathrm{Red}^*_A$ to occur, in either
situation the projections of $X_T^+$ and $X_T^-$ on the subset $\mathscr{H}_A(T)$ must differ (otherwise $X_{t_\star}(A)$ will not
depend nontrivially on $X_0$). If $T = 0$ this trivially holds, and if $T > 0$ then, as $\mathscr{H}_A(T) = \{(w, T)\}$ for
some vertex $w$ and we did not expose any information on the space-time slab $V \times [0, T]$, the probability
of this event is exactly $\mathfrak{m}_T$. Furthermore, using (2.10) we have $\mathfrak{m}_T \le e^{t_\star - T}\, \mathfrak{m}_{t_\star}$, and by definition,
along the interval $[T, t_\star - 1]$ there are at least two branches in $\mathscr{H}_A$, so $\mathfrak{L}(\mathscr{H}_A) \ge 2(t_\star - 1 - T)$; thus,
$\mathfrak{m}_T \le e^{\frac12 \mathfrak{L}(\mathscr{H}_A) + 1}\, \mathfrak{m}_{t_\star}$. Also, on the event $A \in \mathrm{Red}^*_A$ the histories $\mathscr{H}_A$ must all join by time $T$, and thus
$\chi(\mathscr{H}_A) \ge \mathfrak{W}(A) - 1$ since $\mathscr{H}_A$ must observe at least that many branching edges for connectivity.

In conclusion, for any v £ A, 


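\[ \mathbb{P}\big( A \in \mathrm{Red}^*_A \,\big|\, \mathcal{U} \big) \le e\, \mathfrak{m}_{t_\star}\, \mathbb{E}\Big[ \mathbf{1}_{\{\chi(\mathscr{H}_A) \ge \mathfrak{W}(A) - 1\}}\, e^{\frac12 \mathfrak{L}(\mathscr{H}_A)} \,\Big|\, \mathcal{U} \Big] , \]

which for any $\lambda > 0$ and $\alpha > 0$ is at most

\[ \mathfrak{m}_{t_\star}\, e^{1 - (\lambda + \alpha + 1)(\mathfrak{W}(A) - 1)}\, \mathbb{E}\Big[ e^{(\lambda + \alpha + 1) \chi(\mathscr{H}_A) + \frac12 \mathfrak{L}(\mathscr{H}_A)} \,\Big|\, \mathcal{U} \Big] . \]

Plugging in $\alpha$ as given by Lemma 3.1 (which we recall does not depend on the pre-factor of $\chi(\mathscr{H}_A)$ in
that lemma), the exponential moment above is at most $e^{\alpha |A|}$, and revisiting (3.3) we conclude that

\[ \Psi_A \le \mathfrak{m}_{t_\star}\, e^{1 + (1 + \alpha)|A| - (\lambda + \alpha + 1)(\mathfrak{W}(A) - 1)} \le e^{\lambda + \alpha + 2}\, \mathfrak{m}_{t_\star}\, e^{-\lambda \mathfrak{W}(A)} . \qquad ■ \]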


3.2. Exponential decay of cluster sizes: Proof of Lemma 3.1. We develop the history of a set $A$
backward in time from $t_\star$ by exposing the space-time slab. Let $W_s = |\mathscr{H}_A(t_\star - s)|$ count the number of
vertices in the history of $A$ at time $t_\star - s$, let $Y_s = \chi(\mathscr{H}_A \cap V \times [t_\star - s, t_\star])$ be the total number of vertices
observed by the history by time $t_\star - s$ and let $Z_s = \sum_{u \in V} \int_{t_\star - s}^{t_\star} \mathbf{1}_{\{(u,t) \in \mathscr{H}_A\}} \, dt$ be the total length of the
history in the time interval $[t_\star - s, t_\star]$ of the space-time slab. Initially we have $(W_0, Y_0, Z_0) = (|A|, 0, 0)$.

Recall that the probability that an update of a vertex $v$ will branch out to its $d$ neighbors is $1 - \theta$
and that with probability $\theta$ it is oblivious, in which case it observes no neighbors. Thus we can stochastically
dominate $(W_s, Y_s, Z_s)$ by a process $(\bar W_s, \bar Y_s, \bar Z_s)$ defined as follows. Initially, $(\bar W_0, \bar Y_0, \bar Z_0) = (|A|, 0, 0)$
and at rate $\theta \bar W_s$ we decrease $\bar W_s$ by 1 and at rate $(1 - \theta) \bar W_s$ both $\bar W_s$ and $\bar Y_s$ increase by $d$. The length
grows as $d\bar Z_s = \bar W_s \, ds$. Now consider the process

\[ Q_s = \exp\big( \eta \bar Z_s + \lambda \bar Y_s + \alpha \bar W_s \big) \]

for some $\alpha > -\log(1 - \eta)$ which does not depend on $\lambda$. We have that

\[ \frac{d}{ds}\, \mathbb{E}[Q_s \mid Q_{s_0}] \Big|_{s = s_0} = \Big( \eta + \theta (e^{-\alpha} - 1) + (1 - \theta)\big(e^{(\alpha + \lambda) d} - 1\big) \Big) \bar W_{s_0} Q_{s_0} , \]

which is negative provided $\theta$ is sufficiently close to 1 (guaranteed by taking $\beta_0$ sufficiently small). Then,
letting $\tau$ be the first time that $\bar W_\tau = 0$, optional stopping for the supermartingale $Q_s$ yields

\[ \mathbb{E} \exp\big( \eta \bar Z_\tau + \lambda \bar Y_\tau \big) \le \mathbb{E} Q_0 = \exp(\alpha |A|) . \]

By the stochastic domination we have that $\mathbb{E}\big[ \exp\big( \eta\, \mathfrak{L}(\mathscr{H}_A) + \lambda\, \chi(\mathscr{H}_A) \big) \big] \le \exp(\alpha |A|)$. Under this
coupling the effect of conditioning on $\mathcal{U}$ is simply to expedite updates and hence reduce the length of
the process, thus the conditioning can only reduce the expectation. ■
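
The sign condition on the drift and the dominating process itself can both be explored numerically. The sketch below evaluates the drift coefficient $\eta + \theta(e^{-\alpha} - 1) + (1-\theta)(e^{(\alpha+\lambda)d} - 1)$ and simulates one trajectory of the dominating process until extinction; the function names, the event cap `max_events` and the sample parameters are assumptions introduced for this illustration.

```python
import math
import random

def drift_coefficient(eta, lam, alpha, beta, d):
    """Coefficient multiplying W * Q in the drift of Q_s; the argument needs it negative."""
    theta = 1.0 - math.tanh(beta * d)
    return (eta + theta * (math.exp(-alpha) - 1.0)
            + (1.0 - theta) * (math.exp((alpha + lam) * d) - 1.0))

def simulate_dominating_process(size_a, beta, d, rng, max_events=10**6):
    """Run the dominating process from (|A|, 0, 0); return the final (W, Y, Z)."""
    theta = 1.0 - math.tanh(beta * d)
    w, y, z = size_a, 0, 0.0
    events = 0
    while w > 0 and events < max_events:
        holding = rng.expovariate(w)      # total event rate is theta*w + (1 - theta)*w = w
        z += w * holding                  # length grows at rate w between events
        if rng.random() < theta:
            w -= 1                        # oblivious update: one branch terminates
        else:
            w += d                        # branching: W and Y both increase by d
            y += d
        events += 1
    return w, y, z

print(drift_coefficient(0.5, 1.0, 1.0, 1e-6, 4))
print(simulate_dominating_process(5, 1e-6, 4, random.Random(3)))
```

With $\eta = 1/2$ one needs $\alpha > \log 2$; the example takes $\alpha = 1$ and a very small $\beta$ so that the branching term is negligible and the drift coefficient is negative.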